Abstract

Aim:

The Maximum Entropy Theory of Ecology (METE) is a unified theory of biodiversity that attempts to simultaneously predict patterns of species abundance, size, and spatial structure. The spatial predictions of this theory have repeatedly performed well at predicting diversity patterns across scales. However, the theoretical development and evaluation of METE have focused on predicting patterns that ignore inter-site spatial correlations. As a result, the theory has not been evaluated using one of the core components of spatial structure. We develop and test a semi-recursive version of METE's spatially explicit predictions for the distance decay relationship of community similarity and compare METE's performance to the classic random placement model of completely random species distributions. This provides a better understanding and stronger test of METE's spatial community predictions.

Location:

New World tropical and temperate plant communities.

Methods:

We analytically derived and simulated METE's spatially explicit expectations for the Sørensen index of community similarity. We then compared the distance decay of community similarity of 16 mapped plant communities to METE and the random placement model.

Results:

The version of METE we examined was successful at capturing the general functional form of empirical distance decay relationships, a negative power function relationship between community similarity and distance. However, the semi-recursive approach consistently over-predicted the degree and rate of species turnover and yielded worse predictions than the random placement model.

Main conclusions:

Our results suggest that while METE's current spatial models accurately predict the spatial scaling of species occupancy, and therefore core ecological patterns like the species-area relationship, its semi-recursive form does not accurately characterize spatially explicit patterns of correlation. More generally, this suggests that tests of spatial theories based only on the species-area relationship may appear to support the underlying theory despite significant deviations in important aspects of spatial structure.
Introduction

Community structure can be characterized using a variety of macroecological relationships such as the species-abundance, body size, and species spatial distributions. Increasingly, ecologists have recognized that many of these macroecological patterns are inter-related, and progress has been made toward unifying the predictions of multiple patterns using theoretical models (Storch et al., 2008; McGill, 2010). One approach to predicting suites of macroecological patterns is to use process-based models such as niche and neutral dispersal models, which have the potential to provide biological insight into the processes structuring ecological systems (Adler et al., 2007). Alternatively, a new class of constraint-based models suggests that similar patterns may be produced by different sets of processes because the form of the predicted pattern is due to the existence of statistical constraints rather than directly reflecting detailed biological processes (Frank, 2009, 2014; McGill & Nekola, 2010; Locey & White, 2013).

The Maximum Entropy Theory of Ecology (METE) is a recent attempt to explain a number of ecological patterns from the statistical constraint perspective (Harte et al., 2008, 2009; Harte, 2011; Harte & Newman, 2014). METE uses the principle of entropy maximization, that the most likely distribution is the one with the least information (i.e., the one closest to the uniform distribution) subject to a set of constraints (i.e., prior information), to predict distributions of species abundance, body size, and spatial structure. A frequentist perspective on the Maximum Entropy modeling approach is that every possible configuration of a system is equally likely; therefore, the probability of a particular distribution is directly proportional to the number of configurations that distribution is compatible with (Harte, 2011; Harte & Newman, 2014). The distribution with the largest number of compatible system configurations is the predicted most likely state of the system. In contrast to detailed biological models of community assembly, METE has no free parameters and only requires information on total community area, total number of individuals, total number of species, and total metabolic rate of all individuals to generate its predictions.

There is strong empirical support for METE's predictions for the species abundance distribution and patterns related to the spatial distribution of individuals and species (Harte et al., 2008, 2009; Harte, 2011; White et al., 2012a; Xiao et al., 2013; McGlinn et al., 2013; Newman et al., 2014). Specifically, METE has been successful at predicting spatially implicit patterns of community structure such as the species spatial abundance distribution and the species-area relationship (Harte et al., 2008, 2009; McGlinn et al., 2013). It has even been proposed that the METE spatial predictions yield a widely applicable universal species-area relationship (Harte et al., 2009, 2013, but see Šizling et al., 2011, 2013). However, all of METE's spatial predictions that have been tested focus on spatially implicit patterns that ignore spatial correlations. As a result, the theory has not been evaluated using one of the core components of spatial structure. This is due in part to the fact that METE's spatial correlation predictions have not been fully derived.

The most commonly studied ecological pattern that relies on these spatial correlations is the distance decay relationship (DDR), in which the similarity of species composition decreases with distance (Nekola & White, 1999). The DDR provides a spatially explicit, community-level characterization of intra-specific aggregation patterns including correlations in space (Plotkin & Muller-Landau, 2002; Palmer, 2005; Morlon et al., 2008; McGlinn & Palmer, 2011), and predicting the DDR is an important area of future development for METE because the DDR is necessary to accurately extrapolate community patterns to unsampled areas (Harte, 2011).

Here we explore METE's spatially explicit predictions for the DDR by developing analytical and simulation-based solutions and comparing them to empirical data. We build on the Hypothesis of Equal Allocation Probabilities (HEAP; Harte et al., 2005; Harte, 2007) using an approach that combines elements of both a non-recursive and recursive version of METE (McGlinn et al., 2013). We test those predictions using data from 16 spatially explicit plant communities and compare METE's performance to the classic Random Placement Model (RPM), in which individuals are randomly placed on the landscape (Coleman, 1981). Our approach provides a stronger evaluation of the performance of this model and of whether it can explain patterns of spatial structure in the absence of detailed biological processes.

Methods

METE has thus far been used to derive the probability that a random cell on a landscape will be occupied by a given number of individuals (i.e., the intra-specific spatial abundance distribution). Predictions for this distribution have been based either on recursively subdividing an area in half or on predicting species abundances directly at smaller scales (Harte, 2011; McGlinn et al., 2013). In addition to the spatial abundance distribution, the DDR requires a prediction for the correlations in abundance among neighboring cells, which has proven difficult to derive for METE (Harte, 2011).

Developing METE's Spatially Explicit Predictions

METE's spatial predictions depend on two conditional probability distributions, which are computed using independent applications of MaxEnt:

1) the species abundance distribution (SAD), $\Phi(n \mid S_0, N_0)$, the probability that a species has abundance $n$ in a community with $S_0$ species and $N_0$ individuals, and

2) the intra-specific spatial abundance distribution, $\Pi(n \mid A, n_0, A_0)$, the probability that $n$ individuals of a species with $n_0$ total individuals are located in a random quadrat of area $A$ drawn from a total area $A_0$.



The METE prediction for $\Phi$ is calculated using entropy maximization with constraints on the average number of individuals per species ($N_0/S_0$) and the maximum number of individuals $N_0$ for a given species, which yields a truncated log-series abundance distribution (Harte et al., 2008; Harte, 2011). The spatially implicit $\Pi$ distribution is solved for using entropy maximization with constraints on the average number of individuals per unit area ($n_0/A_0$) and the maximum number of individuals $n_0$ of a given species. Although METE requires information on total metabolic rate to derive its predictions, the exact value that this constraint takes has no influence on $\Phi$ and $\Pi$ (Harte et al., 2009; Harte, 2011).

Previous studies have downscaled (or upscaled) METE's predictions using recursive and non-recursive approaches. Here we develop a spatially explicit approach to downscaling METE's predictions that combines elements of both approaches and builds off an existing theoretical framework for modeling the DDR. With the recursive version of METE, $\Phi$ and $\Pi$ are solved for at each successive halving or bisection of $A_0$ until the area of interest is reached. After each bisection, $\Phi$ and $\Pi$ are calculated and used to derive predicted values of average $S$ and $N$ at that scale, which provide updated constraints for the next bisection (Harte et al., 2009). Alternatively, a non-recursive approach can be used in which $\Phi$ and $\Pi$ at the spatial grain of interest are solved for directly from the constraints placed at $A_0$ (Harte et al., 2008). A semi-recursive approach is also possible in which $\Pi$ is recursively downscaled but $\Phi$ is not. The semi-recursive predictions of METE have not been previously examined, but this model builds directly on the existing theoretical derivations of the DDR by Harte (2007) for the Hypothesis of Equal Allocation Probabilities (HEAP). In Appendix A, Figs. A1 and A2, we examine how the semi-recursive formulation of METE differs from a previous examination of the METE recursive and non-recursive SARs (McGlinn et al., 2013), and in Appendix B we develop the analytical derivations of the semi-recursive formulation of the DDR.

In the semi-recursive formulation of the DDR, multi-cell correlations emerge from the spatially nested application of a recursive bisection scheme in which individuals are randomly placed in the left or right half of a cell at each bisection (Fig. 1). Biologically, this can be thought of as a sequentially dependent colonization rule in which individuals randomly choose to occupy the left or right side of an area depending on the existing number of individuals in each half (Harte et al., 2005; Harte, 2007; Conlisk et al., 2007). Our version of METE assumes that for a single bisection there is an equal likelihood for every possible spatial configuration of indistinguishable individuals (Eq. B1). Multi-cell spatial correlations emerge from this approach because the two cells that are formed from a common parent cell are adjacent to one another and are likely to be more similar in abundance than other cells on the landscape (Fig. 1). This approach has three important and inter-related limitations: 1) at each stage in the bisection algorithm, information about the cells surrounding the parent cell is ignored when determining allocations within the parent cell, 2) between-cell distance is defined in reference to an artificial bisection scheme which does not have a one-to-one correspondence with physical distance, and 3) the correlation between cells does not decrease smoothly with physical distance. Alternative approaches have been proposed for deriving the DDR for METE based on computing the single-cell $\Pi$ distribution at two or more scales and then using the scaling of this marginal distribution to infer the probabilities of a given spatial configuration of abundance (Harte, 2011). However, these approaches have yet to yield predictions for the DDR.
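Written out for one and two bisections, the equal-likelihood assumption takes the following form. This is a sketch consistent with the rule described above, not a reproduction of Eq. B1 from Appendix B; $m$ is a dummy variable (introduced here) for the abundance in the intermediate half-cell.

```latex
% One bisection: each of the n_0 + 1 possible abundances in a given half is equally likely.
\Pi\!\left(n \mid A_0/2,\, n_0,\, A_0\right) = \frac{1}{n_0 + 1},
\qquad n = 0, 1, \dots, n_0 .

% Two bisections: first place m individuals in the parent half, then n of those m in the quarter.
\Pi\!\left(n \mid A_0/4,\, n_0,\, A_0\right)
  = \sum_{m=n}^{n_0} \frac{1}{n_0 + 1} \cdot \frac{1}{m + 1} .
```

For $n_0 = 1$ the second expression gives a probability of 1/4 that the single individual falls in a given quarter cell and 3/4 that the cell is empty, as symmetry requires.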
The analytical forms of the semi-recursive formulation (Appendix B) are time-intensive to compute due to the multiple levels of recursion, ignore patterns of abundance (i.e., are formulated only in terms of presence-absence), and are not exact. An alternative approach to deriving semi-recursive METE predictions for the DDR is to use a spatially explicit simulation.

Spatially Explicit METE Simulation

To simulate semi-recursive METE's spatial predictions, the equal probability rule (Eq. B1) that METE assumes when total area is halved is recursively applied starting at the anchor scale $A_0$ and progressively bisecting the area until the finest spatial grain of interest is achieved (Fig. 1). Abundance in the simulation model can be parameterized using an observed SAD or using a random realization of the METE SAD given the values of $S_0$ and $N_0$. Once the abundances of the species are assigned, each species is independently spatially distributed. Because the equal probability rule requires that every number of individuals from 0 to $n_0$ is equally likely to occur on the left side of the total area $A_0$, the number of individuals on the left side can be set as a draw from a discrete uniform distribution between 0 and $n_0$, and the remaining individuals are placed on the right side.
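The recursive allocation just described can be sketched in a few lines of Python. This is an illustrative sketch rather than the authors' simulator: the function names, the seed argument, and the convention of alternating left/right and top/bottom splits (starting with a left/right split) are assumptions made for the example.

```python
import random


def bisect_species(n, bisections, rng, split_lr=True):
    """Distribute n individuals of one species over 2**bisections cells.

    Returns a 2D list (rows of cell abundances). At each bisection the count
    sent to the first half is a discrete uniform draw on 0..n, and the
    remainder goes to the second half.
    """
    if bisections == 0:
        return [[n]]
    first = rng.randint(0, n)  # randint is inclusive at both ends
    second = n - first
    a = bisect_species(first, bisections - 1, rng, not split_lr)
    b = bisect_species(second, bisections - 1, rng, not split_lr)
    if split_lr:
        # Place the two halves side by side (left and right).
        return [row_a + row_b for row_a, row_b in zip(a, b)]
    # Otherwise stack the two halves (top and bottom).
    return a + b


def simulate_community(abundances, bisections, seed=0):
    """Independently distribute each species; returns one grid per species."""
    rng = random.Random(seed)
    return [bisect_species(n0, bisections, rng) for n0 in abundances]


if __name__ == "__main__":
    grids = simulate_community([10, 3], bisections=3, seed=1)
    for grid in grids:
        for row in grid:
            print(row)
        print()
```

With three bisections and an initial left/right split, each species occupies a grid of 2 rows by 4 columns, and the cell counts for a species always sum to its total abundance.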

Datasets

We used a database of 16 spatially explicit and contiguous community datasets compiled by McGlinn et al. (2013) to evaluate the DDR predictions of recursive METE (Table 1). All of the sites were terrestrial, woody plant communities with the exception of the serpentine grassland dataset, which covered a terrestrial, herbaceous plant community. In the woody plant communities, all stems were recorded that were at least 10 mm in diameter at breast height (i.e., 1.4 m from the ground) with the exception of the Oosting and Cross Timbers sites, where the minimum diameter was 20 and 25 mm, respectively. Recursive METE only generates predictions for bisections of total area; therefore, we restricted our analysis to square areas or rectangular areas with a length-to-width ratio of 2:1. Two of the sites had irregular plot designs: Sherman and Cocoli. At these sites we partitioned the datasets into two 2:1 rectangles, analyzed each half independently, and then averaged the results (see Supplemental Information: Fig. S1 in McGlinn et al., 2013). See McGlinn et al. (2013) for additional information on site selection criteria, and in particular their Supplemental Table 1, which provides a more complete description of the datasets used in our analysis.

Data Analysis

We compared the fit of METE with and without the observed SAD and the random placement model (RPM) to the empirical DDRs. The METE predictions represented averages of the abundance-based Sørensen index across 200 simulated communities. The abundance-based RPM predictions were generated by distributing the observed number of individuals of each species randomly in space and then computing the average abundance-based Sørensen index across 500 permutations (Morlon et al., 2008).
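The following sketch shows one way the RPM expectation could be computed. It assumes the quantitative (Bray-Curtis type) form of the Sørensen index, $2\sum_i \min(a_i, b_i) / (\sum_i a_i + \sum_i b_i)$; the text above does not spell out the formula, so this choice, the function names, and the convention of returning 0 for two empty samples are all assumptions of the example.

```python
import random


def sorensen_abundance(a, b):
    """Abundance-based Sorensen similarity between two samples.

    a and b are equal-length lists of species abundances. Two empty samples
    are given similarity 0.0 (an assumed convention).
    """
    shared = sum(min(x, y) for x, y in zip(a, b))
    total = sum(a) + sum(b)
    return 2.0 * shared / total if total > 0 else 0.0


def random_placement(abundances, n_cells, rng):
    """Place every individual in a uniformly random cell.

    Returns cells[c][s], the abundance of species s in cell c.
    """
    cells = [[0] * len(abundances) for _ in range(n_cells)]
    for s, n0 in enumerate(abundances):
        for _ in range(n0):
            cells[rng.randrange(n_cells)][s] += 1
    return cells


def mean_rpm_similarity(abundances, n_cells, n_perm=500, seed=0):
    """Average pairwise similarity across n_perm random placements."""
    rng = random.Random(seed)
    per_perm = []
    for _ in range(n_perm):
        cells = random_placement(abundances, n_cells, rng)
        sims = [
            sorensen_abundance(cells[i], cells[j])
            for i in range(n_cells)
            for j in range(i + 1, n_cells)
        ]
        per_perm.append(sum(sims) / len(sims))
    return sum(per_perm) / len(per_perm)


if __name__ == "__main__":
    print(mean_rpm_similarity([50, 20, 5], n_cells=8, n_perm=100))
```

Because individuals are placed without reference to location, the expected similarity under this model is the same for every pair of cells, which is why the RPM shows no decay of similarity with distance.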

The DDR is sensitive to the choice of the spatial grain of comparison (Nekola & White, 1999), so we examined the DDR at several spatial grains for each dataset. We examined spatial grains resulting from 3-13 bisections of $A_0$. To ensure that the samples at a given grain were square, we only considered odd numbers of bisections when $A_0$ was rectangular and even numbers of bisections when $A_0$ was square. To ensure the best possible comparison between the observed data and METE and to avoid detecting unusual spatial artefacts in the METE predicted patterns, we employed the "user rules" of Ostling et al. (2004) such that samples at a specific grain (i.e., level of bisection) were only compared if they were separated by a specific line of bisection (i.e., a given separation order; Fig. 1 and Appendix A, Fig. A3). This approach was taken rather than the standard method of constructing the DDR from all possible pairwise sample comparisons without reference to an imposed bisection scheme. We computed geographic distance by averaging the distance between all samples compared at a given separation order. For the Cross Timbers study site we were not able to examine the DDR based on the METE SAD because of difficulty in generating random realizations of the METE SAD needed for the community simulator when $S_0$ is less than approximately 10. Typically, averages of community similarity are used to examine the geometry of the DDR; however, in some cases the distribution of the similarity metric may be strongly skewed, and therefore we computed both averages and medians of community similarity at each separation order.
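To make the grouping of comparisons by bisection line concrete, the sketch below labels each cell by its sequence of bisection choices (0 or 1 at each level) and groups cell pairs by the first bisection at which they fall on opposite sides. Whether the separation order used above is numbered from the coarsest or the finest bisection is not stated in this section, so the numbering here (1 = first, coarsest bisection) is an assumption, as are the function names.

```python
from itertools import combinations, product


def first_separating_bisection(addr_a, addr_b):
    """1-based index of the first bisection that separates two cells.

    Each address is a tuple of 0/1 choices, one per bisection level,
    listed from the coarsest to the finest bisection.
    """
    for level, (x, y) in enumerate(zip(addr_a, addr_b), start=1):
        if x != y:
            return level
    raise ValueError("identical cells are not separated by any bisection")


def pairs_by_separation(bisections):
    """Group all pairs of cells by the bisection line that separates them."""
    cells = list(product((0, 1), repeat=bisections))
    groups = {}
    for a, b in combinations(cells, 2):
        level = first_separating_bisection(a, b)
        groups.setdefault(level, []).append((a, b))
    return groups


if __name__ == "__main__":
    groups = pairs_by_separation(3)
    for level in sorted(groups):
        print(level, len(groups[level]))
```

For three bisections this yields 16, 8, and 4 pairs separated by the first, second, and third bisection lines, respectively; the count of eight at the second line matches the eight comparisons described in the Fig. 1 caption.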

We used weighted least squares (WLS) regression to account for differences in the number of pairwise comparisons at different spatial lags (there are many more comparisons at short lags) when fitting the power and exponential models of the DDR (Venables & Ripley, 2002). We examined the power and exponential models because they are the simplest statistical models of the DDR, and it was recently suggested that at fine spatial scales the DDR should be best approximated by a power model (Nekola & White, 1999; Nekola & McGill, 2014).
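A weighted fit of the two candidate models can be sketched as follows. The example assumes both models are fit on log-transformed similarity (power: $\log s = \log a + b \log d$; exponential: $\log s = \log a + b\,d$) with the number of pairwise comparisons at each lag as the weight; the transformation, the weighting variable, and the function names are assumptions of this sketch rather than details given in the text.

```python
import math


def wls_line(x, y, w):
    """Weighted least squares fit of y = intercept + slope * x."""
    sw = sum(w)
    xbar = sum(wi * xi for wi, xi in zip(w, x)) / sw
    ybar = sum(wi * yi for wi, yi in zip(w, y)) / sw
    sxx = sum(wi * (xi - xbar) ** 2 for wi, xi in zip(w, x))
    sxy = sum(wi * (xi - xbar) * (yi - ybar) for wi, xi, yi in zip(w, x, y))
    slope = sxy / sxx
    return ybar - slope * xbar, slope


def fit_ddr(dist, sim, n_pairs, model="power"):
    """Fit a power or exponential DDR; returns (log intercept, slope)."""
    if model == "power":
        x = [math.log(d) for d in dist]
    elif model == "exponential":
        x = list(dist)
    else:
        raise ValueError("model must be 'power' or 'exponential'")
    y = [math.log(s) for s in sim]  # requires strictly positive similarity
    return wls_line(x, y, n_pairs)


if __name__ == "__main__":
    dist = [1.0, 2.0, 4.0, 8.0]
    sim = [0.5 * d ** -0.3 for d in dist]
    n_pairs = [16, 8, 4, 2]  # more comparisons at short lags
    print(fit_ddr(dist, sim, n_pairs, model="power"))
```

On the synthetic data in the usage example the power fit recovers a slope near -0.3 and a log intercept near log(0.5), since the data lie exactly on a power curve.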

We checked that our results were consistent with the results provided in previous studies (Harte, 2007, Figs. 6.7 and 6.8; Harte, 2011, Fig. 4.1), and that the DDR generated by the community simulator closely agreed with the analytical solution Eq. B5 (Appendix B, Fig. B1). The code to recreate the analysis is provided as Appendix D and at the following publicly available repository: http://dx.doi.org/10.6084/m9.figshare.978918.
Results

In general, the semi-recursive METE distance decay relationship (DDR) provided a poor fit to the empirical DDR (Figs. 2 and 3). The average and median community similarity results were highly correlated (r = 0.98) and generated qualitatively similar results (Appendix A, Figs. A5 and A8); therefore, we focus on the results based on averaging similarity. While the METE DDRs exhibited the general functional form of the empirical DDRs, an approximately power-law decrease in similarity with distance, they typically had lower intercepts and steeper slopes than the empirical DDRs (Fig. 2; Appendix A, Figs. A4 and A6). Both the empirical and METE-predicted DDRs were better approximated by power than by exponential models (Appendix A, Fig. A6). METE converged towards reasonable predictions at fine spatial grains; however, this is to be expected because at these scales similarity in both the observed and predicted patterns must converge to zero due to low individual density (grey points in Fig. 3A,B): when individual density is low, the probability of samples sharing species decreases rapidly simply due to chance. The RPM is known to be a poor model for distance decay because it does not exhibit a decrease in similarity with distance. Nevertheless, it fit the empirical DDR slightly better than METE (Figs. 2 and 3).

The METE DDR was not strongly influenced by the choice of using the observed or the METE SAD (Figs. 2 and 3A,B). The METE SAD typically yielded a DDR with a slightly lower intercept, with the exception of the four tropical sites, where it produced DDRs with slightly higher intercepts. In general, we did not observe strong consistent differences between the habitat types (Fig. 2; Appendix A, Fig. A7).

Our formulation of a semi-recursive METE produced SARs that generally agreed (i.e., within the 95% CI) with the recursive and non-recursive formulations of METE (Harte et al., 2009); however, the semi-recursive approach appeared to deviate systematically towards lower richness at fine spatial scales, which is consistent with predicting stronger patterns of spatial aggregation compared to the other formulations of METE (Appendix A, Figs. A1 and A2).

Discussion

The semi-recursive METE distance decay relationship (DDR) was well approximated by a decreasing power function, and thus consistent with the general form of empirical DDRs, but it provided a poor fit to empirical data: the slope and the intercept of this power function deviated substantially from the empirical values. These deviations contrast with a number of studies showing that the theory successfully predicts both the $\Pi$ distribution and the SAR (Harte et al., 2008, 2009; Harte, 2011; McGlinn et al., 2013; but see Šizling et al., 2011). Both $\Pi$ and the SAR are influenced by the spatially explicit pattern of intraspecific aggregation, but neither pattern reflects inter-quadrat correlations, and therefore they represent coarse metrics of spatial structure. The combination of a well fit SAR and a poorly fit DDR suggests that the current version of METE accurately characterizes average occupancy but fails to characterize the spatial relationships among cells (McGeoch & Gaston, 2002; Storch et al., 2003; McGlinn & Hurlbert, 2012; Nekola & McGill, 2014).

These results only apply directly to the particular HEAP-based semi-recursive version of the spatial METE theory, which represents a middle ground in terms of approach between Harte et al. (2008) and Harte et al. (2009). Other approaches to deriving the METE DDR may perform better than the semi-recursive approach if they can be developed. It has been suggested that there is no a priori reason to prefer one version of the theory and that the best way to choose among the different versions is empirically (Haegeman & Etienne, 2010; Harte, 2011). However, the traditionally defined recursive and non-recursive versions of METE have shortcomings with respect to how their assumptions and predictions are scaled, and the semi-recursive approach we defined is limited by its dependence on an artificial bisection scheme. Specifically, the recursive approach predicts that the SAD has the same functional form, a truncated log-series, at all scales. This is problematic because SADs are typically not scale-invariant if, as METE predicts, species display intraspecific spatial aggregation (Green & Plotkin, 2007; Šizling et al., 2009). The non-recursive approach does not suffer from this problem because the SAD is only solved for at the anchor scale; however, Haegeman and Etienne (2010) found that the non-recursive predictions for a multi-cell generalization of the $\Pi$ distribution were scale-inconsistent. The semi-recursive approach does not suffer from this shortcoming because its multi-cell form (see Eq. 2.2 in Conlisk et al., 2007) is only defined over the set of bisections that are consistent with a landscape in which $n_0$ individuals are distributed (see Appendix C for proof). However, the set of bisections is artificial, and multi-cell correlations only emerge from this approach in reference to bisection distance rather than directly to physical distance between cells, such that cells have equal magnitude of correlation regardless of their physical distance if they have equivalent separation orders (see Conlisk et al., 2007, for a critique of distances defined by separation indices). An important future direction for METE is to attempt to develop spatial multi-cell predictions using approaches that avoid these shortcomings, and the two approaches suggested by Harte (2011) for deriving the METE DDR may provide a useful starting point for future development.

Our results suggest that the predictions of semi-recursive METE differ from spatial patterns observed in nature. This deviation could indicate that the emergent statistical approach to modeling spatial structure is incorrect, with specific biological processes such as dispersal limitation or environmental filtering directly controlling spatial correlation (Condit et al., 2002; Gilbert & Lechowicz, 2004; Karst et al., 2005; Seidler & Plotkin, 2006; Chase, 2007; McGlinn & Palmer, 2011). Alternatively, it could mean that while the general idea underlying the theory is valid, the specific formulation is wrong. For example, it could be that the approaches outlined by Harte (2011), which are more sophisticated in how they handle spatial correlations, will be more appropriate, or that a generalized version of this kind of recursive approach, like that developed by Conlisk et al. (2007) in which the degree of aggregation is a tunable parameter, will capture the reality of biological systems more precisely. However, process- and constraint-based models should not necessarily be treated as mutually exclusive. For example, other process-based theories make power-law-like predictions for the form of the DDR. In fact, it has recently been suggested that at fine spatial scales most theories will make predictions that are approximately power-law in nature (Nekola & McGill, 2014). This means that simply noting power-law-like DDRs does not provide a strong method for differentiating among theories. Indeed, had we simply looked for power-law-like behavior we would have concluded that the semi-recursive METE was consistent with empirical data. However, one of the properties that makes METE such a strong theory is that it makes specific predictions for precise parameters as well as general forms of empirical relationships. This allows it to be more rigorously compared to data and to other theories that predict different parameter values for a similar general form of the DDR (e.g., neutral theory).

Our results mirror those of Xiao et al. (2013) and Newman et al. (2014) evaluating the non-spatial aspects of METE. All three studies show that when evaluating the theory using multiple patterns simultaneously, some of the predictions perform well and some perform poorly. It is inherently difficult for theories to predict large numbers of patterns simultaneously, which is why evaluating theory in this way provides stronger tests than evaluating single patterns (McGill, 2003; McGill et al., 2006). General theories like METE that make multiple predictions are therefore both easier to evaluate and also more broadly useful since they allow a large number of patterns to be predicted from a relatively small amount of information. Because there are many patterns to evaluate, it is also more likely that deviations from theory will be identified (White et al., 2012). In some cases these deviations may indicate that the theory is fundamentally unsound, but in others they may suggest modifications to the theory to address the observed deviations (White et al., 2012). Whether METE can be modified to address the observed deviations from empirical data remains to be seen. In the case of the DDR, despite its generality, there are a limited number of models that attempt to predict the DDR from first principles (Chave & Leigh, 2002; Condit et al., 2002; Zillio et al., 2005; Harte, 2007, 2011; Nekola & McGill, 2014), which means that it may be worth pursuing the METE approach further.

METE is one of several general theories in ecology that make many predictions for many aspects of ecological community structure based on only a small amount of information. Our analysis of the semi-recursive formulation of METE's spatially explicit prediction for the DDR suggests that this form of the theory over-predicts the strength of spatial correlation. These results, coupled with studies of the species-area relationship, suggest that semi-recursive METE accurately predicts the scaling of species occupancy but not spatial correlation. More generally, our results demonstrate that tests of spatial theories that focus solely on the species-area relationship and related patterns are only evaluating part of the spatial pattern, the distribution of occupancy among cells. Evaluating these theories using the DDR in addition to the SAR will help identify cases where the theories are correctly identifying some aspects of spatial structure, but not others, and thus yield stronger tests of the underlying theory. In some cases this will require extending the theory to make additional predictions, but this effort will provide both more testable and more usable theories.
Tables:

Table 1. Summary of the habitat type and state variables of the vegetation datasets. The state variables are total area ($A_0$), total abundance ($N_0$), and total number of species ($S_0$). $A_{min}$ and $A_{max}$ are the finest and coarsest areas (m²) examined. Data were collected on woody forest plants with the exception of the serpentine site, which contained herbaceous grassland plants.

Site name       Habitat type      Ref   A_min (m²)  A_max (m²)  A_0        N_0         S_0
BCI             tropical          1-3   61.0        62500       500000     205096      301
Sherman         tropical          4     2.4         625         20000      7623        175
Cocoli          tropical          4     2.4         625         20000      4326        139
Luquillo        tropical          5     15.3        15625       125000     32320       124
Bryan           oak-hickory       6-8   2.1         535         17113      3394        48
Big Oak         oak-hickory       6-8   2.4         625         20000      5469        40
Oosting         oak-hickory       9     16          4096        65536      8892        39
Rocky           oak-hickory       6-8   3.5         900         14400      3383        37
Bormann         oak-hickory       6-8   4.8         1225        19600      3879        30
Wood Bridge     oak-hickory       6-8   1.2         315         5041       758         19
Bald Mtn.       oak-hickory       6-8   2.4         156         5000       669         17
Landsend        old field, pine   6-8   1.0         264         8450       2139        41
Graveyard       old field, pine   6-8   2.4         625         10000      2584        36
UCSC            mixed-evergreen   10    5.4         1406        45000      5885        31
Serpentine      serpentine        11    0.3         4           64         37182       24
Cross Timbers   oak woodland      12    9.8         2500        40000      7625        7
Ranges                                  0.3-61.0    4-62500     64-500000  669-205096  7-301



References for Table 1: 1, Condit (1998); 2, Hubbell et al. (1999); 3, Hubbell et al. (2005); 4, Condit et al. (2004); 5, Zimmerman et al. (1994); 6, Peet and Christensen (1987); 7, McDonald et al. (2002); 8, Xi et al. (2008); 9, Palmer et al. (2007); 10, Gilbert et al. (2010); 11, Green et al. (2003); 12, Arevalo (2013).

Figures

Fig. 1. This diagram illustrates the "user rules" of how a landscape is bisected and how samples are compared for a given separation order. In this specific example, three bisections are used to generate a spatially explicit distribution of 10 individuals. In the last panel, the eight pairwise comparisons (arrows) at a separation order of 2 for a scale of $A_0/2^3$ (i.e., $A_{i=3}$, $D_{j=2}$) are illustrated. When simulating random bisections, the number of individuals distributed to the left or right of the bisection line is a random draw from a discrete uniform distribution.

Fig. 2. The observed (black line with dots) and predicted distance decay relationships (METE: dark grey lines, solid for the observed SAD, dashed for the METE SAD; random placement: light grey line) for each site at a single spatial grain. Community similarity represents the average of the abundance-based Sørensen index for each spatial lag. The spatial grain displayed was taken at either 8 or 9 bisections of the total area depending on whether the total extent was a square or a rectangle, respectively. Geographic distance was calculated as the average physical distance between the samples compared at a given separation order (see Methods and Fig. 1 for additional information).
[Figure 3 panels: (A) recursive, METE SAD; (B) recursive, observed SAD; (C) random, observed SAD. Axis label: Predicted Similarity.]

Fig. 3. The log-log transformed one-to-one plots of the predicted and observed abundance-based Sørensen similarity values for the three models across all distances and spatial grains. The solid line is the one-to-one line. The grey points represent values from spatial grains in which the average individual density was low (i.e., less than 10 individuals) and thus both the observed and predicted similarities must be close to zero simply because of a sampling effect.



